Phonetic Classification on Wide-Band and Telephone Quality Speech
Abstract
Benchmarking the performance of telephone-network-based speech recognition systems is hampered by two factors: the lack of standardized databases for telephone network speech, and insufficient understanding of the impact of the telephone network on recognition systems. The experiments described in this paper use the N-TIMIT database to "calibrate" the effect of the telephone network on phonetic classification algorithms. Phonetic classification algorithms were developed for wide-band and telephone quality speech and tested on subsets of the TIMIT and N-TIMIT databases. The classifier described in this paper achieves an accuracy of 75% on wide-band TIMIT data and 66.5% on telephone quality N-TIMIT data. Overall, the telephone network seems to increase the error rate by a factor of 1.3.
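The factor of 1.3 can be checked directly from the two reported accuracies. The sketch below is only an arithmetic illustration, assuming that error rate is defined as 100% minus classification accuracy; the function and variable names are hypothetical and do not come from the paper.

```python
def error_rate(accuracy_pct: float) -> float:
    """Error rate in percent, assuming error = 100% - accuracy."""
    return 100.0 - accuracy_pct


# Accuracies reported in the abstract.
wideband_error = error_rate(75.0)    # TIMIT (wide-band)
telephone_error = error_rate(66.5)   # N-TIMIT (telephone quality)

# Relative increase in error attributable to the telephone network.
ratio = telephone_error / wideband_error

print(f"wide-band error: {wideband_error:.1f}%")
print(f"telephone error: {telephone_error:.1f}%")
print(f"error-rate ratio: {ratio:.2f}")
```

Under this assumption the error rate rises from 25% to 33.5%, a ratio of about 1.34, which is consistent with the abstract's rounded figure of 1.3.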
Related resources
Wideband speech coding based on the MBE structure
This paper deals with the adaptation to wideband of the MBE coder which was initially developed for the telephone band. As the constraints of quality and bit rate for a wideband and a telephone band coder are different, and as the signal characteristics on these two bands are different too, we must reconsider the coder structure. Several improvements are proposed, some of which were already pro...
Classification of Fricatives Using Feature Extrapolation of Acoustic-Phonetic Features in Telephone Speech
This paper proposes a classification module for fricative consonants in telephone speech using an acoustic-phonetic feature extrapolation technique. In channel-deteriorated telephone speech, acoustic cues of fricative consonants are expected to be degraded or missing due to limited bandwidth. This paper applies an extrapolation technique to acoustic-phonetic features based on Gaussian mixture m...
Phonetic Landmark Detection for Automatic Language Identification
This paper presents a method of augmenting shifted-delta cepstral coefficients (SDCCs) with the classification outputs of an array of support vector machines (SVMs) trained to detect a set of manner and place features on telephone speech. The SVM array allows for broad phoneme classification, and when this information is concatenated with SDCCs to form a hybrid feature vector for each acoustic ...
Classification of emotional speech using spectral pattern features
Speech Emotion Recognition (SER) is a new and challenging research area with a wide range of applications in man-machine interactions. The aim of a SER system is to recognize human emotion by analyzing the acoustics of speech sound. In this study, we propose Spectral Pattern features (SPs) and Harmonic Energy features (HEs) for emotion recognition. These features extracted from the spectrogram ...
Session 8B: Robust Speech Processing
Four papers are briefly reviewed. 1. The Papers. This session consists of two types of papers. The first two, "Multiple approaches to robust speech recognition" and "Reduced channel dependence for speech recognition", present computational methods for minimizing the acoustic and speaker differences in particular recognizers. The third paper, "Experimental results for base-line speech recogniti...